Abstract

We propose a simple universal (that is, distribution-free) steganographic
system in which covertexts with and without hidden texts are statistically
indistinguishable. The stegosystem can be applied to any source generating
i.i.d. covertexts with unknown distribution, and the hidden text is
transmitted exactly, with zero probability of error. Moreover, the proposed
steganographic system has two important properties. First, the rate of
transmission of hidden information approaches the Shannon entropy of the
covertext source as the size of blocks used for hidden text encoding tends
to infinity. Second, if the size of the alphabet of the covertext source and
its min-entropy tend to infinity then the number of bits of hidden text per
letter of covertext tends to log(n!)/n, where n is the (fixed) size of blocks
used for hidden text encoding. The proposed stegosystem uses randomization.



1 Introduction



The goal of steganography is as follows. Alice and Bob can exchange messages 
of a certain kind (called covertexts) over a public channel which is open to 
Eve. The covertexts can be, for example, photographic images, videos, text 
emails and so on. Alice wants to pass some secret information to Bob so 
that Eve cannot notice that any hidden information was passed. Thus,
Alice should use the covertexts to hide the secret text. It is supposed that
Alice and Bob share a secret key. A classical illustration from [11] states the
problem in terms of communication in a prison: Alice and Bob are prisoners
who want to concoct an escape plan by passing each other messages which can
be read by a warden.

Perhaps the first formal approach to steganography was taken by Cachin
[2], who proposed a steganographic protocol in which, relying on the fact
that the probability distribution of covertexts is known, covertexts with and 
without hidden information are statistically indistinguishable. In the same 
work a universal (distribution-free) steganographic system was proposed, in 
which this property holds only asymptotically with the size of the messages 
going to infinity, and which has exponential complexity of coding and decod- 
ing. Distribution-free stegosystems are of particular practical importance, 
since in reality covertexts can be graphical images, ICQ or email messages, 
that is, sources for which the distribution is not only unknown but perhaps 
cannot be reasonably approximated. Later a complexity-theoretic approach
to (distribution-free) steganography was developed in [5, 12], where
stegosystems were proposed in which covertexts with and without hidden information
are indistinguishable in polynomial time. 

We use the following model for steganography, mainly following [2]. It is
assumed that Alice has access to an oracle which generates independent
and identically distributed covertexts according to some fixed but unknown
distribution $\mu$. Covertexts belong to some (possibly infinite) alphabet A.
Alice wants to use this source for transmitting hidden messages. A hidden
message is a sequence of letters from B = {0, 1} generated independently
with equal probabilities of 0 and 1. We denote the source of hidden messages
by $\omega$. This is a commonly used model for the source of secret messages since
it is assumed that secret messages are encrypted by Alice using a key shared 
only with Bob. If Alice uses the ideal Vernam cipher then the encrypted 
messages are indeed generated according to the Bernoulli 1/2 distribution, 
whereas if Alice uses modern block or stream ciphers then the encrypted 
sequence "looks like" a sequence of random Bernoulli 1/2 trials. Here to 
"look like" means to be indistinguishable in polynomial time, or that the 
likeness is confirmed experimentally by statistical data, known for all widely 
used ciphers; see, e.g. jSHU]. The third party, Eve, is reading all messages
passed from Alice to Bob and is trying to determine whether secret messages 
are being passed in the covertexts or not. Observe that if covertexts with 
and without hidden information have the same probability distribution ($\mu$)
then it is impossible to distinguish them. 

In the universal system proposed in [2] the hiddentext sequence is divided 
into blocks of a certain size m each of which corresponds to a block of length 
n(m) of covertext letters from A. The distribution of resulting covertext
letters tends to the (unknown) distribution $\mu$ (of covertexts without hidden
information) as n tends to infinity. It is important to note that the conver- 
gence is not uniform (on the set of all possible distributions with A fixed), 
and also the memory size of coder and decoder grows exponentially with n. 

We propose a simple universal stegosystem for which covertexts with and 
without hidden information have the same distribution (and hence are sta- 
tistically indistinguishable) for any size of the message. The hidden text is 
transmitted correctly with probability 1. Moreover, the proposed system has 
two important properties. First, the rate of transmission of hidden infor- 
mation approaches the Shannon entropy of the covertext source as the size 
n of blocks used for hidden text encoding tends to infinity. Second, if the 
size of the alphabet of the covertext source and its minentropy tend to in- 
finity then the number of bits of hidden text per letter of covertext tends to 
log(n!)/n, where n is the (fixed) size of blocks used for hidden text
encoding. The latter property is, in particular, an advantage as compared to the
complexity-theory based stegosystems proposed in [5, 12], for which the
rate of hidden text transmission is no more than a constant per covertext 
letter. We note that it is also possible to use the proposed stegosystems for 
open-key steganography in a standard way. 

The paper is organized as follows. In Section 2 a simple stegosystem
which does not use randomization is proposed; for this system the number
of bits of hidden text per letter of covertext tends to 1/2 if the size of the
alphabet of the covertext source and its min-entropy tend to infinity. This
system also illustrates the main ideas used in Section 3, where the general
(randomized) stegosystem is proposed which has the mentioned asymptotic
properties of the rates of hidden text transmission. In Section 4 we discuss
possible extensions of the proposed steganographic systems and outline
some potentially interesting open problems. In particular, we discuss issues
concerning stegosystems based on a common set of data and open-key
steganography.

2 A simple non-randomized universal stegosystem

In this section we present a very simple stegosystem which demonstrates the 
main ideas used in the general stegosystem which we develop in the next sec- 
tion. The stegosystem described in this section does not use randomization. 

The notation is as follows. The source $\mu$ draws i.i.d. (covertext) letters
from an alphabet A. The source $\omega$ draws i.i.d. (hidden, or secret)
equiprobable letters from the alphabet B = {0, 1}. Finite groups of (covertext,
hidden, secret) letters are sometimes called (covertext, hidden, secret) words.
Elements of A (B) are usually denoted by x (y).

First consider an example. Consider a situation in which not only the
secret letters are drawn (using $\omega$) from a binary alphabet, but also the
source of covertexts $\mu$ generates symbols from the alphabet A = {a, b} (not
necessarily with equal probabilities). Suppose that Alice has to transmit the
sequence $y^* = y_1 y_2 \dots$ generated according to $\omega$ and let there be
given a covertext sequence $x^* = x_1 x_2 \dots$ generated by $\mu$. For example, let

$$ y^* = 01100\ldots, \qquad x^* = aababaaaabbaaaaabb\ldots \qquad (1) $$

The sequences x* and y* are encoded in a new sequence X (to be transmitted 
to Bob) such that y* is uniquely determined by X and the distribution of X 
is the same as the distribution of x* (that is, $\mu$; in other words, X and x*
are statistically indistinguishable). 

The encoding is carried out in two steps. First let us group all symbols
of x* into pairs, and denote

$$ aa = u, \quad bb = u, \quad ab = v_0, \quad ba = v_1. $$

In our example, the sequence (1) is represented as

$$ x^* = aa\ ba\ ba\ aa\ ab\ ba\ aa\ aa\ bb \ldots = u\, v_1 v_1 u\, v_0 v_1 u\, u\, u \ldots $$

Then X is obtained from x* as follows: all pairs corresponding to u are
left unchanged, while all pairs corresponding to $v_k$ are transformed to pairs
corresponding to $v_{y_1} v_{y_2} v_{y_3} \dots$; in our example

$$ X = aa\ ab\ ba\ aa\ ba\ ab\ aa\ aa\ bb \ldots $$

Decoding is obvious: Bob groups the symbols of X into pairs, ignores all
occurrences of aa and bb and changes ab to 0 and ba to 1.

The properties of the described stegosystem, which we call St2, are sum- 
marized in the following (nearly obvious) statement. 

Claim 1. Suppose that a source generates i.i.d. random variables taking 
values in A = {a, b} and let this source be used for encoding secret messages 
consisting of a sequence of i.i.d. equiprobable binary symbols using the method 
St2. Then the sequence of symbols output by the stegosystem obeys the same 
distribution $\mu$ as the input sequence.

We will not give the (obvious) proof of this claim since it is a simple
corollary of Theorem 1 below.

It is interesting to note that a similar construction was used by von Neu- 
mann in his method for obtaining a sequence of equiprobable binary symbols 
(see IHIIS]) from a sequence of independent flips of a biased coin. His method, 
as well as the just described stegosystem, was based on the fact that the 
probabilities of ab and ba are equal.

Next we consider the generalisation of the described stegosystem to the
case of any alphabet A (such that |A| > 1). To do this we fix some total
ordering on the set A. As before, Alice has to transmit a sequence
$y^* = y_1 y_2 \dots$ generated by the source $\omega$ of i.i.d. equiprobable binary
letters and let there be given a sequence $x^* = x_1 x_2 \dots$ of covertext letters
generated i.i.d. according to a distribution $\mu$ on A. Again we transform the
sequences y* and x* into a new sequence X which obeys the same distribution as x*.
As before we break x* into blocks of length 2. If a block $x_{2i-1} x_{2i}$ has the
form aa for some $a \in A$ then it is left unchanged. Otherwise let the block
$x_{2i-1} x_{2i}$ be ab for $a, b \in A$ and suppose a < b; if the current symbol $y_k$ is
0 then the block ab is included in X, and if $y_k = 1$ then ba is included in X.
If a > b then encode in the opposite way (that is, ba is included in X for
$y_k = 0$ and ab for $y_k = 1$). To decode, the sequence is broken
into pairs of symbols, all pairs of the form aa are ignored and a pair of the
form ab is decoded as 0 if a < b and as 1 otherwise. Denote this stegosystem
by $St_2(A)$.
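
The pairwise scheme just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `st2_encode` and `st2_decode` are hypothetical, the total ordering on A is taken to be Python's built-in `<` on the letters, and the hidden sequence is assumed to contain at least as many bits as there are unequal pairs.

```python
def st2_encode(cover, bits):
    """Hide bits in a sequence of covertext letters, pair by pair.

    Returns the stego sequence and the number of hidden bits consumed.
    Assumes `bits` holds at least as many bits as there are unequal pairs.
    """
    bits = iter(bits)
    stego, used = [], 0
    for i in range(0, len(cover) - 1, 2):
        a, b = cover[i], cover[i + 1]
        if a != b:
            lo, hi = min(a, b), max(a, b)
            # 0 -> letters in increasing order, 1 -> decreasing order
            a, b = (lo, hi) if next(bits) == 0 else (hi, lo)
            used += 1
        stego += [a, b]
    if len(cover) % 2:  # an unpaired last letter carries nothing
        stego.append(cover[-1])
    return stego, used


def st2_decode(stego):
    """Recover the hidden bits: skip equal pairs, read the order of the rest."""
    return [0 if a < b else 1
            for a, b in zip(stego[0::2], stego[1::2]) if a != b]


# The example (1): x* = aababaaaabbaaaaabb, y* = 01100
stego, used = st2_encode(list("aababaaaabbaaaaabb"), [0, 1, 1, 0, 0])
print("".join(stego), used, st2_decode(stego))
```

On the example (1) the sketch outputs the sequence aa ab ba aa ba ab aa aa bb and consumes four hidden bits, which the decoder returns as 0, 1, 1, 0.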

Theorem 1. Suppose that a source $\mu$ generates i.i.d. random variables
taking values in some alphabet A. Let this source be used for encoding secret
messages consisting of a sequence of i.i.d. equiprobable binary symbols, using
the method $St_2(A)$. Then the sequence of symbols output by the stegosystem
obeys the same distribution $\mu$ as the input sequence and the number of letters
of hidden text transmitted per letter of covertext is
$\frac{1}{2}\bigl(1 - \sum_{a \in A} \mu(a)^2\bigr)$.

Proof. Fix some $\alpha, \beta \in A$ and $k \in \mathbb{N}$. We will show that

$$ \rho(X_{2k-1} X_{2k} = \alpha\beta) = \mu(\alpha\beta), $$

where $\rho$ is the probability distribution of the output sequence. Suppose
$\alpha < \beta$. Decomposing the probability on the left we get

$$ \rho(X_{2k-1} X_{2k} = \alpha\beta) = \omega(y_k = 0)\,\bigl(\mu(\alpha\beta) + \mu(\beta\alpha)\bigr) = \tfrac{1}{2} \cdot 2\mu(\alpha)\mu(\beta) = \mu(\alpha\beta). $$

The case $\beta < \alpha$ is analogous, and the case $\beta = \alpha$ is trivial. The second
statement is obtained by calculating the probability that letters in the block
coincide. □

Note that in practice when the covertexts are, for example, graphical 
files, each covertext is practically unique (the alphabet A is potentially in- 
finite) so that the number of covertext letters (files) per one hidden bit is 
approximately 2. 
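
As a small numerical check of the rate in Theorem 1, the following sketch evaluates (1 - sum of mu(a)^2)/2 exactly. The helper names `st2_rate` and `uniform` are hypothetical, and the uniform source on k letters is only an assumed test case, not something taken from the text.

```python
from fractions import Fraction


def st2_rate(mu):
    """Hidden bits per covertext letter: (1 - sum_a mu(a)^2) / 2."""
    return (1 - sum(p * p for p in mu.values())) / 2


def uniform(k):
    """Hypothetical test source: k equiprobable letters 0..k-1."""
    return {a: Fraction(1, k) for a in range(k)}


for k in (2, 16, 1024):
    print(k, float(st2_rate(uniform(k))))
```

For a uniform source on k letters the rate equals (1 - 1/k)/2, so it approaches 1/2 as k grows, in line with the remark that about two covertext letters are needed per hidden bit.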



3 General construction of a universal 
stegosystem 

In this section we consider the general construction of a universal
stegosystem which has the desired asymptotic properties. As before, Alice needs
to transmit a sequence $y^* = y_1 y_2 \dots$ of secret binary messages drawn by an
i.i.d. source $\omega$ with equal probabilities of 0 and 1, and let there be given
a sequence of covertexts $x^* = x_1 x_2 \dots$ drawn i.i.d. by a source $\mu$ from an
alphabet A. First we break the sequence x* into blocks of n symbols each,
where n > 1 is a parameter. Each block will be used to transmit several
symbols from y* (for example, in the previously constructed stegosystem $St_2(A)$
each block of length 2 was used to transmit 1 or 0 symbols). However, in the
general case a problem arises which was not present in the construction of
$St_2(A)$. Namely, we have to align the lengths of the blocks of symbols from
x* and from y*, and for this we will need randomization. The problem is
that the probabilities of blocks from y* are (negative) powers of 2, which is
not necessarily the case with blocks from x*.

We now present a formal description. Let u denote the first n symbols of
x*: $u = x_1 \dots x_n$, and let $\nu_u(a)$ be the number of occurrences of the symbol
a in u. Define the set $S_u$ as consisting of all words of length n in which the
frequency of each letter $a \in A$ is the same as in u:

$$ S_u = \{ v \in A^n : \forall a \in A \ \ \nu_v(a) = \nu_u(a) \}. $$

Observe that the $\mu$-probabilities of all members of $S_u$ are equal. Let there
be given some ordering on the set $S_u$ (for example, lexicographical) which is
known to both Alice and Bob (and to anyone else) and let
$S_u = \{s_0, s_1, \dots, s_{|S_u|-1}\}$ with this ordering.

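Denote $m = \lfloor \log_2 |S_u| \rfloor$, where $\lfloor y \rfloor$ stands for the largest integer not
greater than y. Consider the binary expansion of $|S_u|$:

$$ \alpha_m \alpha_{m-1} \dots \alpha_0, $$

where $\alpha_m = 1$, $\alpha_j \in \{0, 1\}$, $m > j \ge 0$. In other words,

$$ |S_u| = 2^m + \alpha_{m-1} 2^{m-1} + \alpha_{m-2} 2^{m-2} + \dots + \alpha_0. $$

Define a random variable $\Delta$ as taking each value $i \in \{0, 1, \dots, m\}$ with
probability $\alpha_i 2^i / |S_u|$:

$$ P(\Delta = i) = \alpha_i 2^i / |S_u|. \qquad (2) $$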
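
The distribution (2) is determined by the binary digits of the size of the set of rearrangements, so it can be computed with a few bit operations. The sketch below is an illustration only: the names `delta_distribution` and `sample_delta` are hypothetical, and the argument `size` stands for the number of words in that set.

```python
import random
from fractions import Fraction


def delta_distribution(size):
    """Return {i: P(Delta = i)} with P(Delta = i) = a_i * 2^i / size,
    where a_m ... a_0 are the binary digits of size."""
    m = size.bit_length() - 1
    return {i: Fraction(((size >> i) & 1) << i, size) for i in range(m + 1)}


def sample_delta(size, rng=random):
    """Draw one value of Delta using integer weights a_i * 2^i."""
    m = size.bit_length() - 1
    weights = [((size >> i) & 1) << i for i in range(m + 1)]
    return rng.choices(range(m + 1), weights=weights)[0]


dist = delta_distribution(6)      # 6 = 110 in binary, so m = 2
print(dist, sum(dist.values()))
print(sample_delta(6, random.Random(0)))
```

For a set of size 6 the value 0 has probability 0, the value 1 has probability 1/3 and the value 2 has probability 2/3; the probabilities always sum to 1 because the weights are exactly the binary digits of the size.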

Alice, having read u, generates a value of the random variable $\Delta$, say d, and
then reads d symbols from y*. Consider the word r* represented by these
symbols as an integer which we denote by r'. Then we encode the word r*
(that is, d bits of y*) by the word $s_r$ from the set $S_u$, where

$$ r = \sum_{l=d+1}^{m} \alpha_l 2^l + r'. $$

(In other words, the word $s_r$ is being output by the coder.)

Then Alice reads the next block of n covertext letters, and so on. Denote the
constructed stegosystem by $St_n(A)$.

To decode the received sequence Bob breaks it into blocks of length n and 
repeats all the steps in the reversed order: by the current word u he obtains 
$S_u$ and r, then d (clearly d is uniquely defined by r), r' and r*; that is, he
finds the next |r*| symbols of the secret sequence y*.
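
The encoding and decoding steps above can be put together in a short sketch. It is meant for small n only, since it lists the whole set of rearrangements explicitly. All function names are hypothetical, and two details are assumptions rather than statements of the text: the ordering of the set is taken to be lexicographic, and the d hidden bits are read as an integer with the most significant bit first.

```python
import random
from itertools import permutations


def type_class(u):
    """All distinct rearrangements of block u, in lexicographic order."""
    return sorted(set(permutations(u)))


def offset(size, d):
    """sum_{l=d+1}^{m} a_l 2^l: `size` with its d+1 lowest bits cleared."""
    return (size >> (d + 1)) << (d + 1)


def encode_block(u, bits, d):
    """Output the word s_r for block u, hiding the first d entries of bits."""
    s = type_class(u)
    assert (len(s) >> d) & 1, "d must be a value Delta can take"
    assert len(bits) >= d, "not enough hidden bits"
    r_prime = int("".join(map(str, bits[:d])), 2) if d else 0  # MSB first
    return s[offset(len(s), d) + r_prime]


def decode_block(v):
    """Recover the hidden bits carried by the received block v."""
    s = type_class(v)
    r = s.index(tuple(v))
    for d in range(len(s).bit_length()):
        lo = offset(len(s), d)
        if (len(s) >> d) & 1 and lo <= r < lo + (1 << d):
            return [int(c) for c in format(r - lo, "b").zfill(d)] if d else []


def encode_stream(cover, bits, n, rng=random):
    """Encode block by block; returns (stego letters, number of bits used)."""
    out, pos = [], 0
    for i in range(0, len(cover) - n + 1, n):
        u = cover[i:i + n]
        size = len(type_class(u))
        weights = [((size >> j) & 1) << j for j in range(size.bit_length())]
        d = rng.choices(range(len(weights)), weights=weights)[0]  # eq. (2)
        out.extend(encode_block(u, bits[pos:pos + d], d))
        pos += d
    rem = len(cover) % n
    if rem:  # a trailing partial block is passed through unchanged
        out.extend(cover[-rem:])
    return out, pos


def decode_stream(stego, n):
    """Concatenate the bits recovered from each complete block."""
    bits = []
    for i in range(0, len(stego) - n + 1, n):
        bits += decode_block(stego[i:i + n])
    return bits


print(encode_block("bac", [0, 1, 1, 0], 1), decode_block("cab"))
```

With u = bac, d = 1 and hidden bits starting with 0, the sketch selects the word with index 4 in the lexicographic list abc, acb, bac, bca, cab, cba, that is, cab, and decoding cab returns the single bit 0.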

Consider an example which illustrates all the steps of the calculation. Let
A = {a, b, c}, n = 3, u = bac. Then $S_u$ = {abc, acb, bac, bca, cab, cba}, $|S_u| = 6$,
$m = 2$, $\alpha_2 = 1$, $\alpha_1 = 1$, $\alpha_0 = 0$. Let the sequence of secret messages be
0110..., that is, y* = 0110... . Suppose the value of $\Delta$ generated by Alice is 1.
Then she reads one symbol of y* (in this case 0) and calculates r* = 0, r' = 0,
$r = 2^2 + 0 = 4$ and finds the codeblock $s_4$ = cab. To decode the message,
Bob from the block cab calculates r = 4, r' = 0, r* = 0 and finds the next
symbol of the secret sequence: 0.

Theorem 2. Let a source $\mu$ be given, which generates i.i.d. random variables
taking values in some alphabet A. Let this source be used for encoding secret
messages consisting of a sequence of i.i.d. equiprobable binary symbols using
the described method $St_n(A)$ with n > 1. Then

(i) the sequence of symbols output by the stegosystem obeys the same
distribution $\mu$ as the input sequence,

(ii) the average number of secret symbols per covertext ($L_n$) satisfies the
following inequality

$$ L_n \ge \frac{1}{n} \Bigl( \sum_{u \in A^n} \mu(u) \log_2 \frac{n!}{\prod_{a \in A} \nu_u(a)!} - 2 \Bigr), \qquad (3) $$

where $\mu(u)$ is the $\mu$-probability of the word u and $\nu_u(a)$ is the number
of occurrences of the letter a in the word u.

Proof. To prove the first statement it is sufficient to show that for any cover- 
text word u of length n its probability of occurrence in the output sequence is 
$1/|S_u|$. This follows from (2) and the fact that letters in y* are independent
and equiprobable. 
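
The uniformity argument can be checked exactly for small set sizes. The sketch below assumes the index rule described above (the hidden integer is added to the part of the binary expansion above position d); the function name `output_index_distribution` is hypothetical.

```python
from fractions import Fraction


def output_index_distribution(size):
    """Exact probability of each output index r in 0..size-1.

    Delta = d is drawn with probability a_d * 2^d / size (equation (2)),
    and the d hidden bits are uniform, so each of the 2^d indices in the
    range assigned to d receives probability (2^d / size) * 2^(-d).
    """
    prob = [Fraction(0)] * size
    for d in range(size.bit_length()):
        if not (size >> d) & 1:
            continue                                  # a_d = 0: d never occurs
        p_d = Fraction(1 << d, size)                  # P(Delta = d)
        lo = (size >> (d + 1)) << (d + 1)             # sum_{l>d} a_l 2^l
        for r_prime in range(1 << d):
            prob[lo + r_prime] += p_d * Fraction(1, 1 << d)
    return prob


print(output_index_distribution(6))
```

Every index receives probability exactly 1/size, because the index ranges assigned to the different values of d are disjoint and together cover 0, ..., size - 1.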

The second statement can be obtained by direct calculation of the average
number of symbols from y* encoded by one block. Indeed, from (2) we find
that for each covertext word u the expected number of transmitted symbols
is $\sum_{i=1}^{m} i\,\alpha_i 2^i / |S_u| > \log_2 |S_u| - 2$, where $m = \lfloor \log_2 |S_u| \rfloor$, and for each word u we
have $|S_u| = n! / \prod_{a \in A} \nu_u(a)!$. □



Let us now consider the asymptotic behaviour of $L_n$ as $n \to \infty$.

Corollary 1. If the alphabet A is finite then the average number of hidden
symbols per letter $L_n$ goes to the Shannon entropy $h(\mu)$ of the source $\mu$ as n
goes to infinity; here by definition $h(\mu) = -\sum_{a \in A} \mu(a) \log \mu(a)$.

Proof. This statement follows from a well-known fact of Information Theory
which states that for each $\delta > 0$ and $n \to \infty$ the following inequality holds
with probability 1

$$ h(\mu) - \delta < \log |S_u| / n < h(\mu) + \delta, $$

see, e.g. □ 
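
The convergence in Corollary 1 can be observed numerically in the simplest case. The sketch below computes the average number of hidden bits per letter exactly for a two-letter alphabet, using the per-block expectation that appears in the proof of Theorem 2. The function names are hypothetical, and the source with probabilities 1/3 and 2/3 is an assumed example.

```python
from fractions import Fraction
from math import comb, log2


def expected_bits(size):
    """E[Delta] = sum_i i * a_i * 2^i / size for a set of the given size."""
    total = sum(i << i for i in range(size.bit_length()) if (size >> i) & 1)
    return Fraction(total, size)


def rate_binary(p, n):
    """Exact bits per letter for an i.i.d. source on {a, b}, mu(a) = p.

    A block with k letters a has probability p^k (1-p)^(n-k), and there
    are comb(n, k) such blocks, all sharing one set of size comb(n, k).
    """
    q = 1 - p
    per_block = sum(comb(n, k) * p**k * q**(n - k) * expected_bits(comb(n, k))
                    for k in range(n + 1))
    return per_block / n


def entropy_binary(p):
    """Shannon entropy (in bits) of the two-letter source."""
    return -sum(float(x) * log2(float(x)) for x in (p, 1 - p) if x)


p = Fraction(1, 3)
for n in (8, 64, 256):
    print(n, round(float(rate_binary(p, n)), 4), round(entropy_binary(p), 4))
```

The computed rate stays below the entropy for every n and moves closer to it as the block length grows.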

In many real stegosystems the alphabet A is huge (it can consist, for
example, of all possible digital photographs of a given file format, or of all
possible e-mail messages). In such cases it is interesting to consider the
asymptotic behaviour of $L_n$ with fixed n when the alphabet size |A| goes to
infinity. For this we need to define the so-called min-entropy of the source $\mu$:

$$ H_\infty(\mu) = \min_{a \in A} \{ -\log \mu(a) \}. \qquad (4) $$

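Corollary 2. Assume the conditions of Theorem 2 and fix the block length
n > 1. If $|A| \to \infty$ so that $H_\infty(\mu) \to \infty$ then $L_n$ tends to $(\log(n!) - O(1))/n$.

This statement simply follows from the fact that the number of different
permutations of n elements is n!: when the min-entropy is large, all n letters
of a block are distinct with probability close to 1, so that $|S_u| = n!$.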
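
To make the limit in Corollary 2 concrete, the sketch below evaluates the number of hidden bits per letter when all letters of a block are distinct, and compares it with log(n!)/n. The function names are hypothetical; the uniform source on k symbols used in `prob_all_distinct` is an assumed example whose min-entropy is log k.

```python
from fractions import Fraction
from math import factorial, log2, prod


def expected_bits(size):
    """E[Delta] = sum_i i * a_i * 2^i / size for a set of the given size."""
    total = sum(i << i for i in range(size.bit_length()) if (size >> i) & 1)
    return Fraction(total, size)


def limiting_rate(n):
    """Bits per letter when every block has n distinct letters (size n!)."""
    return expected_bits(factorial(n)) / n


def prob_all_distinct(k, n):
    """P(n i.i.d. letters, uniform on k symbols, are pairwise distinct)."""
    return prod(Fraction(k - i, k) for i in range(n))


for n in (2, 3, 8):
    print(n, float(limiting_rate(n)), log2(factorial(n)) / n)
print(float(prob_all_distinct(10**6, 5)))
```

For n = 2 the limiting rate is 1/2, matching the non-randomized system of Section 2, and in general it lies within 2/n of log(n!)/n.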

Next we briefly consider the resource complexity of the stegosystem $St_n(A)$.
To store all possible words from the set $S_u$ would require memory of order
$2^n \log |A|$ bits, which is practically unacceptable for large n. However, if we
use the algorithm for fast enumeration from jTHI, then we can find the index 
of a block $s_r$ given r (encoding) and vice versa (decoding) using $O(\log^{\mathrm{const}} n)$
operations per symbol and 0{n\og^ n) bits of memory. 

4 Discussion 

We have proposed two stegosystems (with and without randomization) for 
which the output sequence of covertexts with hidden information is statisti- 
cally indistinguishable from a sequence of covertexts without hidden infor- 
mation. The proposed stegosystems rely heavily on the assumption that the 
oracle generates independent and identically distributed covertexts. This is 
perhaps a reasonable assumption if covertexts are graphical images of a cer- 
tain kind, but if, for example, we want to use just one image to transmit (a 
large portion of) a secret text then our covertexts are parts of the image, 
which are clearly not i.i.d. How to extend the ideas developed in this work
to the case of non-i.i.d. covertexts is perhaps the main open question.

However, the main idea that was used in the proposed stegosystems is 
that for any block of covertexts it is possible to find several other blocks 
which have the same probability as the original one; then hidden information 
can be encoded in the number of a block in this group. This idea can be 
extended to the case of non-independent covertexts. Indeed, suppose that on 
the current step of transmission we know that some covertexts have equal
probabilities to appear as the next generated covertext. That is, among the
conditional (given the current history) probabilities of covertexts there are
several groups of equal probabilities. Then, if the probability of the next
generated covertext belongs to one of these groups, we can use this covertext
(possibly replacing it with another one which has the same probability) for
encoding the next several bits of hiddentext in the same fashion as it is done in
$St_n(A)$. The same applies to blocks of covertexts. Indeed the only feature of
independently and identically distributed covertexts that we used was that 
all permutations within a block of size n have equal probabilities. So the next 
step is to identify equal conditional probability groups in sources of non-i.i.d. 
covertexts. 